Hoax Detection on Indonesian Tweets using Naïve Bayes Classifier with TF-IDF

Abstract

Twitter is one of the most popular social media platforms in the world nowadays. Indonesia has the fifth largest number of Twitter users, and they are always active in expressing themselves and getting information through tweets. A hoax is a lie created as if it were true. Hoaxes are also often spread via Twitter. Hoaxes are extremely dangerous because they can cause discord and even misunderstanding. Therefore, hoaxes must be resisted. This study aims to build a system to detect hoaxes on Indonesian Twitter. The objective of this research is to identify hoax tweets by using a Naïve Bayes classifier with Term Frequency-Inverse Document Frequency (TF-IDF). This study collects and annotates tweets from posts sent by user accounts. This study applied several text preprocessing techniques to provide the datasets. To find the best prediction model, this work splits the datasets into training and testing sets. There are four experimental scenarios that refer to the splitting of the dataset. The results showed that the model with TF-IDF had 64% accuracy and recall, and 69% and 67% precision and F1-score, respectively. This result is superior when compared to the model without TF-IDF. It means that TF-IDF has made a positive contribution to improving the model's performance. Finally, this study contributes to detecting hoax news on Twitter, with a proclivity for filtering what is classified as a hoax or not.
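The pipeline summarized in the abstract (TF-IDF weighting, a Naïve Bayes classifier, and evaluation by accuracy, precision, recall, and F1-score) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the preprocessing steps, the smoothing, the IDF formula, or the split ratios of the four scenarios, so the whitespace tokenizer, the smoothed IDF, the Laplace smoothing value `alpha`, and the English toy tweets below are all assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumed stand-in for the study's preprocessing: lowercase + whitespace split.
    return text.lower().split()


def fit_idf(docs):
    # Smoothed inverse document frequency (an assumed variant):
    # idf(t) = log((1 + N) / (1 + df(t))) + 1
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf(doc, idf):
    # Raw term frequency times IDF; terms unseen during training are dropped.
    tf = Counter(t for t in doc if t in idf)
    return {t: c * idf[t] for t, c in tf.items()}


class TfidfNaiveBayes:
    """Multinomial Naive Bayes fitted on TF-IDF weights instead of raw counts."""

    def fit(self, texts, labels, alpha=1.0):
        docs = [tokenize(t) for t in texts]
        self.idf = fit_idf(docs)
        self.classes = sorted(set(labels))
        n = len(labels)
        self.log_prior = {c: math.log(labels.count(c) / n) for c in self.classes}
        # Sum the TF-IDF weight of every term per class.
        weights = {c: defaultdict(float) for c in self.classes}
        for doc, y in zip(docs, labels):
            for t, w in tfidf(doc, self.idf).items():
                weights[y][t] += w
        # Laplace-smoothed log-likelihood of each term given the class.
        self.log_like = {}
        for c in self.classes:
            total = sum(weights[c].values()) + alpha * len(self.idf)
            self.log_like[c] = {
                t: math.log((weights[c][t] + alpha) / total) for t in self.idf
            }
        return self

    def predict(self, text):
        vec = tfidf(tokenize(text), self.idf)
        scores = {
            c: self.log_prior[c] + sum(w * self.log_like[c][t] for t, w in vec.items())
            for c in self.classes
        }
        return max(scores, key=scores.get)


def evaluate(y_true, y_pred, positive="hoax"):
    # Accuracy, precision, recall, and F1 for the positive ("hoax") class.
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Invented toy data, not taken from the study's dataset.
train_texts = [
    "shocking secret cure share now",
    "share now before deleted shocking",
    "ministry announces official schedule",
    "official weather report today",
]
train_labels = ["hoax", "hoax", "fact", "fact"]

model = TfidfNaiveBayes().fit(train_texts, train_labels)
print(model.predict("shocking cure share"))
print(model.predict("official report"))
```

Comparing this against a variant that passes raw term counts in place of `tfidf` weights would mirror the with/without TF-IDF comparison the abstract reports, and repeating the fit over different train/test proportions would mirror its splitting scenarios.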

Similar resources

Image Classification Using Naïve Bayes Classifier

An image classification scheme using Naïve Bayes Classifier is proposed in this paper. The proposed Naive Bayes Classifier-based image classifier can be considered as the maximum a posteriori decision rule. The Naïve Bayes Classifier can produce very accurate classification results with a minimum training time when compared to conventional supervised or unsupervised learning algorithms. Compreh...

Naïve Bayes Classifier Based Watermark Detection in Wavelet Transform

Robustness is one of the essential properties of watermarking schemes. It is the ability to detect the watermark after attacks. A DWT-based semi-blind image watermarking scheme leaves out the low pass band, and embeds a pseudo random number (PRN) sequence (i.e., the watermark) in the other three bands into the coefficients that are higher than a given threshold T1. During watermark detectio...

Spam Detection System Combining Cellular Automata and Naïve Bayes Classifier

In this study, we focus on the problem of spam detection. Based on a cellular automaton approach and a naïve Bayes technique, which are built as individual classifiers, we evaluate a novel method combining multiple classifiers, diversified both by feature selection and by different classifiers, to determine whether we can detect spam more accurately. This approach combines decisions from three cellular ...

Semantic Naïve Bayes Classifier for Document Classification

In this paper, we propose a semantic naïve Bayes classifier (SNBC) to improve the conventional naïve Bayes classifier (NBC) by incorporating “document-level” semantic information for document classification (DC). To capture the semantic information from each document, we develop semantic feature extraction and modeling algorithms. For semantic feature extraction, we first apply a log-Bilinear d...

Boosting the Tree Augmented Naïve Bayes Classifier

The Tree Augmented Naïve Bayes (TAN) classifier relaxes the sweeping independence assumptions of the Naïve Bayes approach by taking account of conditional probabilities. It does this in a limited sense, by incorporating the conditional probability of each attribute given the class and (at most) one other attribute. The method of boosting has previously proven very effective in improving the per...

Journal

Journal title: Journal of Information System Research (JOSH)

Year: 2023

ISSN: 2686-228X

DOI: https://doi.org/10.47065/josh.v4i3.3317